Good, old-fashioned image processing

Quentin Caudron, PhD

@QuentinCaudron

Data Scientist, CBRE - we're hiring at cbredev.com

"Oh, I'm going to analyse the images by hand. How else would you do it?"

Early images were of fantastic quality, and our algorithms worked out of the box.

Our dataset is beautiful and self-consistent.

No parameter searching was needed; the image processing was lightning fast!

Manual labelling of images was both fast and fun.

In [2]:
# Load the data, and look for a hint of where to start
import numpy as np
from scipy.ndimage import maximum_filter
from skimage import io, transform, color, exposure, filters, morphology

widefield_image_file = "data/Sheep11-4x-77.jpg"
image = transform.rescale(io.imread(widefield_image_file), 0.25)

plot_image(image)
In [3]:
# Perform a colour deconvolution
heu = color.separate_stains(image, colour_deconv_matrix).astype(float)

plot_image(heu)
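Colour deconvolution converts RGB intensities to optical densities (Beer-Lambert law) and then unmixes them against known per-stain absorption vectors. A minimal NumPy sketch with a made-up two-stain matrix (the notebook's real `colour_deconv_matrix` is defined elsewhere):

```python
import numpy as np

# Hypothetical, normalised optical-density vectors for two stains (one per row).
stains = np.array([[0.65, 0.70, 0.29],   # "haematoxylin-like"
                   [0.07, 0.99, 0.11]])  # "eosin-like"
stains /= np.linalg.norm(stains, axis=1, keepdims=True)

def separate(rgb, stain_matrix):
    """Recover per-stain concentrations from RGB via least-squares unmixing."""
    od = -np.log10(np.clip(rgb, 1e-6, 1.0))  # Beer-Lambert: intensity -> optical density
    return od @ np.linalg.pinv(stain_matrix)

# A pixel synthesised from known concentrations should round-trip.
true_conc = np.array([0.8, 0.3])
rgb = 10 ** -(true_conc @ stains)
recovered = separate(rgb, stains)
```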
In [4]:
# Apply CLAHE on the haematoxylin channel
rescaled = exposure.rescale_intensity(heu[:, :, 1], out_range=(0, 1))
equalised_hist = exposure.equalize_adapthist(rescaled)

plot_image(np.dstack((image, equalised_hist)))
In [5]:
# Apply Gaussian blur
blurred = filters.gaussian(equalised_hist, 7)

plot_image(np.dstack((image, blurred)))
In [6]:
# Sigmoid transform for contrast
contrast = exposure.adjust_sigmoid(blurred, cutoff=0.6)

plot_image(np.dstack((image, contrast)))
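`adjust_sigmoid` applies a logistic curve, 1 / (1 + exp(gain * (cutoff - x))), so pixels above the cutoff are pushed towards white and those below towards black (gain defaults to 10 in scikit-image). A quick NumPy sketch of the same mapping:

```python
import numpy as np

def sigmoid_contrast(x, cutoff=0.5, gain=10):
    """Logistic contrast curve, as documented for skimage.exposure.adjust_sigmoid."""
    return 1.0 / (1.0 + np.exp(gain * (cutoff - x)))

x = np.linspace(0, 1, 11)
y = sigmoid_contrast(x, cutoff=0.6)  # the cutoff used in the cell above
```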
In [7]:
# Apply an adaptive threshold
thresholded = contrast > filters.threshold_local(contrast, 351, offset=-0.1)

plot_image(np.dstack((image, thresholded)))
In [8]:
# Use a maximum filter followed by a binary closing to fill any holes
enlarged = maximum_filter(thresholded, 5)
inflammation = morphology.closing(enlarged, morphology.disk(11))

plot_image(np.dstack((image, inflammation)), titles=["Original", "Automated image processing"])
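A binary closing is a dilation followed by an erosion: the dilation bridges small gaps, and the erosion restores the original outline, so interior holes smaller than the structuring element get filled. A toy demonstration with scipy (the cell above uses skimage's `morphology.closing` with an 11-pixel disk):

```python
import numpy as np
from scipy.ndimage import binary_closing

# A solid square with a one-pixel hole in the middle.
blob = np.zeros((7, 7), dtype=bool)
blob[1:6, 1:6] = True
blob[3, 3] = False

# Closing with a 3x3 structuring element fills the hole, leaving the outline intact.
closed = binary_closing(blob, structure=np.ones((3, 3)))
```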
In [9]:
# Load up the human mask and visualise everything together
mask = transform.rescale(io.imread("data/Sheep11-4x-77.jpg_mask.jpg"), 0.25)

plot_image(np.dstack((image, mask, inflammation)), titles=["Original", "Human-masked", "Automatic"])
In [10]:
# Let's use a convolutional encoder-decoder network to find inflammatory zones
from keras.models import load_model

model = load_model("models/histo_convnet.hdf5")
model.summary()
Using TensorFlow backend.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_206 (Conv2D)          (None, 432, 648, 32)      896       
_________________________________________________________________
conv2d_207 (Conv2D)          (None, 432, 648, 32)      9248      
_________________________________________________________________
conv2d_208 (Conv2D)          (None, 432, 648, 32)      9248      
_________________________________________________________________
max_pooling2d_37 (MaxPooling (None, 216, 324, 32)      0         
_________________________________________________________________
conv2d_209 (Conv2D)          (None, 216, 324, 64)      18496     
_________________________________________________________________
conv2d_210 (Conv2D)          (None, 216, 324, 64)      36928     
_________________________________________________________________
conv2d_211 (Conv2D)          (None, 216, 324, 64)      36928     
_________________________________________________________________
max_pooling2d_38 (MaxPooling (None, 108, 162, 64)      0         
_________________________________________________________________
conv2d_212 (Conv2D)          (None, 108, 162, 256)     147712    
_________________________________________________________________
conv2d_213 (Conv2D)          (None, 108, 162, 256)     590080    
_________________________________________________________________
max_pooling2d_39 (MaxPooling (None, 54, 81, 256)       0         
_________________________________________________________________
conv2d_214 (Conv2D)          (None, 54, 81, 512)       1180160   
_________________________________________________________________
conv2d_215 (Conv2D)          (None, 54, 81, 512)       2359808   
_________________________________________________________________
up_sampling2d_37 (UpSampling (None, 108, 162, 512)     0         
_________________________________________________________________
conv2d_216 (Conv2D)          (None, 108, 162, 256)     1179904   
_________________________________________________________________
conv2d_217 (Conv2D)          (None, 108, 162, 256)     590080    
_________________________________________________________________
up_sampling2d_38 (UpSampling (None, 216, 324, 256)     0         
_________________________________________________________________
conv2d_218 (Conv2D)          (None, 216, 324, 64)      147520    
_________________________________________________________________
conv2d_219 (Conv2D)          (None, 216, 324, 64)      36928     
_________________________________________________________________
conv2d_220 (Conv2D)          (None, 216, 324, 64)      36928     
_________________________________________________________________
up_sampling2d_39 (UpSampling (None, 432, 648, 64)      0         
_________________________________________________________________
conv2d_221 (Conv2D)          (None, 432, 648, 32)      18464     
_________________________________________________________________
conv2d_222 (Conv2D)          (None, 432, 648, 32)      9248      
_________________________________________________________________
conv2d_223 (Conv2D)          (None, 432, 648, 32)      9248      
_________________________________________________________________
conv2d_224 (Conv2D)          (None, 432, 648, 1)       289       
=================================================================
Total params: 6,418,113
Trainable params: 6,418,113
Non-trainable params: 0
_________________________________________________________________
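The parameter counts in the summary follow the standard Conv2D formula, kernel_h x kernel_w x channels_in x channels_out + channels_out (the bias term); the counts above are consistent with 3x3 kernels throughout. Checking a few rows:

```python
def conv2d_params(k, c_in, c_out):
    """Weights plus biases for a k x k Conv2D layer."""
    return k * k * c_in * c_out + c_out

assert conv2d_params(3, 3, 32) == 896         # conv2d_206: RGB input -> 32 filters
assert conv2d_params(3, 32, 32) == 9248       # conv2d_207
assert conv2d_params(3, 64, 256) == 147712    # conv2d_212
assert conv2d_params(3, 256, 512) == 1180160  # conv2d_214
assert conv2d_params(3, 512, 256) == 1179904  # conv2d_216
assert conv2d_params(3, 32, 1) == 289         # conv2d_224: final 1-channel mask
```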
In [11]:
# Use the convnet to find the inflammatory zones, and compare with our work so far
nn_image = transform.rescale(image, 0.5)
pred = model.predict(nn_image.reshape(1, *nn_image.shape)).squeeze()

plot_grid((image, mask, inflammation, pred > 0.5), ["Original", "Human-masked", "Classical", "ConvNet"])
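The grid gives a visual comparison; for a number, something like the Dice coefficient, 2 |A and B| / (|A| + |B|), between each automated mask and the human one would quantify agreement. A small sketch over boolean masks:

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two boolean masks (1.0 = perfect overlap)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# e.g. dice(mask > 0.5, inflammation) or dice(mask > 0.5, pred > 0.5)
a = np.array([[1, 1], [0, 0]], dtype=bool)
b = np.array([[1, 0], [1, 0]], dtype=bool)
score = dice(a, b)  # intersection 1, mask sizes 2 and 2
```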